Zhixuan Chu

School of Cyber Science and Technology, Zhejiang University

ConsisGuard: Aligning Safety Deliberation with Policy Enforcement in LLM Guardrails

May 29, 2026

Make LLM Learn to Synthesize from Streaming Experiences through Feedback

May 28, 2026

Inducing Overthink: Hierarchical Genetic Algorithm-based DoS Attack on Black-Box Large Language Reasoning Models

May 14, 2026

Optimal Transport for LLM Reward Modeling from Noisy Preference

May 07, 2026

Robust Reward Modeling for Large Language Models via Causal Decomposition

Apr 16, 2026

JANUS: A Lightweight Framework for Jailbreaking Text-to-Image Models via Distribution Optimization

Mar 22, 2026

Deep Autocorrelation Modeling for Time-Series Forecasting: Progress and Prospects

Mar 20, 2026

CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks

Mar 19, 2026

Explainable Token-level Noise Filtering for LLM Fine-tuning Datasets

Feb 16, 2026

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

Feb 02, 2026